10 research outputs found
leave a trace - A People Tracking System Meets Anomaly Detection
Video surveillance has always had a negative connotation, among other reasons because of the loss of privacy and because it does not automatically increase public safety. If it were able to detect atypical (i.e. dangerous) situations in real time, autonomously and anonymously, this could change. A prerequisite for this is the reliable automatic detection of potentially dangerous situations from video data. Classically, this is done by object extraction and tracking. From the derived trajectories, we then want to determine dangerous situations by detecting atypical trajectories. However, for ethical reasons it is better to develop such a system on data in which no people are threatened or even harmed, and in which they know that such a tracking system is installed. Another important point is that these situations occur only rarely in real, public CCTV areas and are captured properly even less often. In the artistic project leave a trace, the tracked objects, people in the atrium of an institutional building, become actors and thus part of the installation. Real-time visualisation allows these actors to interact, which in turn creates many atypical interaction situations on which we can develop our situation detection. The data set has evolved over three years and is therefore huge. In this article we describe the tracking system and several approaches for the detection of atypical trajectories.
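The abstract above does not specify which anomaly measure the authors use; one common baseline for detecting atypical trajectories is to flag a track whose distance to its nearest neighbour in a set of reference (normal) tracks exceeds a threshold. A minimal sketch of that idea, using the symmetric Hausdorff distance (the function names and the threshold value are illustrative, not from the paper):

```python
import numpy as np

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two trajectories,
    each an (n, 2) array of x/y positions."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def is_atypical(traj, reference, threshold=1.0):
    """Flag traj as atypical if even its closest reference
    trajectory is farther away than the threshold."""
    return min(hausdorff(traj, r) for r in reference) > threshold

# Straight-line reference trajectory vs. a large detour
normal = [np.column_stack([np.linspace(0, 10, 20), np.zeros(20)])]
detour = np.column_stack([np.linspace(0, 10, 20),
                          5 * np.sin(np.linspace(0, np.pi, 20))])
print(is_atypical(detour, normal))           # True
print(is_atypical(normal[0] + 0.1, normal))  # False
```

In practice such a detector would be run on the tracker's output trajectories; more elaborate schemes (clustering, learned density models) follow the same pattern of scoring a new track against previously observed ones.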
3D real time object recognition
Object recognition is a natural process of the human brain, performed in the visual cortex, and relies on a binocular depth perception system that renders a three-dimensional representation of the objects in a scene. Cameras mimic the human eye: images from the two cameras of a stereo system are used by algorithms for an automatic, three-dimensional interpretation of the objects in a scene.
In the process, such images are used as input data for further analysis and for the development of algorithms, an essential ingredient for simulating the complexity of human vision, so as to achieve scene interpretation for object recognition similar to the way the human brain perceives it. The rapid pace of technological advancement in hardware and software is continuously bringing machine-based object recognition closer to the human vision prototype. The key in this field is the development of algorithms that achieve robust scene interpretation. Significant effort has been successfully invested over the years in 2D object recognition, as opposed to 3D. It is therefore within the context and scope of this dissertation to contribute towards the enhancement of 3D object recognition: a better interpretation and understanding of visible reality and of the relationships between objects in a scene. Through the use of low-cost commodity sensors, such as the Microsoft Kinect, RGB and depth data of a scene are retrieved and manipulated in order to generate human-like visual perception data. The goal herein is to show how RGB and depth information can be utilised to develop a new class of 3D object recognition algorithms, analogous to the perception processed by the human brain.
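The RGB-D data this dissertation builds on typically enters a 3D pipeline by back-projecting each depth pixel into a point cloud with the pinhole camera model. A minimal sketch of that step (the intrinsic values are illustrative, Kinect-like defaults, not taken from the work):

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (metres) into a 3D point cloud
    using the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.dstack([x, y, z]).reshape(-1, 3)

# A flat wall 2 m away, seen with Kinect-like intrinsics
depth = np.full((480, 640), 2.0)
pts = depth_to_points(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
print(pts.shape)        # (307200, 3)
print(pts[:, 2].max())  # 2.0
```

The resulting cloud, paired with the RGB image, is the "human-like visual perception data" on which 3D recognition algorithms can then operate.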
A Quality Evaluation of Single and Multiple Camera Calibration Approaches for an Indoor Multi Camera Tracking System
Human detection and tracking has been a prominent research area for many scientists around the globe. State-of-the-art algorithms have been implemented, refined and accelerated to significantly improve the detection rate and eliminate false positives. While 2D approaches are well investigated, 3D human detection and tracking is still a largely unexplored research field. In both the 2D and the 3D case, introducing a multi-camera system can vastly improve the accuracy and confidence of the tracking process. Within this work, a quality evaluation is performed on a multi-RGB-D-camera indoor tracking system, examining how camera calibration and pose affect the quality of human tracks in the scene, independently of the detection and tracking approach used. After performing a calibration step on every Kinect sensor, state-of-the-art single-camera pose estimators were evaluated to check how well poses are estimated using planar objects such as an ordinary chessboard. With this information, a bundle block adjustment and ICP were performed to verify the accuracy of the single pose estimators in a multi-camera configuration. Results show that the single-camera estimators provide high-accuracy results of less than half a pixel, forcing the bundle to converge after very few iterations. With respect to ICP, relative information between cloud pairs is largely preserved, giving a low fitting score between concatenated pairs. Finally, sensor calibration proved to be an essential step for achieving maximum accuracy in the generated point clouds, and therefore in the accuracy of the 3D trajectories produced from each sensor.
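The ICP verification mentioned above repeatedly alternates a nearest-neighbour correspondence search with a closed-form rigid alignment. As a sketch of that inner alignment step only (not the authors' full pipeline), the Kabsch/Umeyama least-squares solution:

```python
import numpy as np

def best_fit_transform(src, dst):
    """Closed-form least-squares rigid transform (Kabsch/Umeyama)
    mapping src points onto dst -- the step ICP repeats after each
    nearest-neighbour correspondence search."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:            # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return R, t

# Recover a known rotation about z plus a translation (noise-free)
rng = np.random.default_rng(0)
src = rng.normal(size=(100, 3))
theta = 0.3
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
dst = src @ Rz.T + np.array([1.0, 2.0, 3.0])
R, t = best_fit_transform(src, dst)
print(np.allclose(R, Rz), np.allclose(t, [1, 2, 3]))  # True True
```

In a multi-camera setup this is what refines each sensor's pose so that overlapping point-cloud pairs fit with a low residual, as reported in the evaluation.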
Calibration of a multiple stereo and RGB-D camera system for 3D human tracking
Human tracking in computer vision is a very active, ongoing research area. Previous works approach this topic by applying algorithms and feature extraction in 2D, while 3D tracking is a fairly unexplored field, especially concerning multi-camera systems. The approach discussed in this paper focuses on the detection and tracking of human postures using multiple RGB-D data sources together with stereo cameras. We use low-cost devices, such as the Microsoft Kinect, and a people counter based on a stereo system. The novelty of our technique concerns the synchronization of multiple devices and the determination of their exterior and relative orientation in space, based on a common world coordinate system. This is then used to apply bundle adjustment and obtain a unique 3D scene, which serves as a starting point for the detection and tracking of humans and for extracting significant metrics from the acquired datasets. In this article, the approaches for determining the exterior and absolute orientation are described. Subsequently, it is shown how a common point cloud is formed. Finally, some results for object detection and tracking based on 3D point clouds are presented.
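Once each sensor's exterior orientation (R, t) in the common world frame is known, fusing the devices into one point cloud reduces to transforming every sensor-frame point into world coordinates. A minimal sketch of that merge (the poses below are made-up values for illustration):

```python
import numpy as np

def to_world(points, R, t):
    """Map an (n, 3) cloud from a sensor frame into the common world
    frame using the sensor's exterior orientation: p_world = R p + t."""
    return points @ R.T + t

# Two sensors observing the same physical point from different poses
world_pt = np.array([1.0, 2.0, 0.5])
R1, t1 = np.eye(3), np.zeros(3)                              # sensor 1 at origin
R2 = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])                            # rotated 90 deg about z
t2 = np.array([3.0, 0.0, 0.0])
p1 = (world_pt - t1) @ R1        # the point expressed in each sensor frame
p2 = (world_pt - t2) @ R2
merged = np.vstack([to_world(p1[None], R1, t1),
                    to_world(p2[None], R2, t2)])
print(np.allclose(merged[0], merged[1]))  # True
```

Bundle adjustment then refines the (R, t) of all sensors jointly so that such co-observed points coincide as closely as possible in the merged scene.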
AlphaGAN: Generative adversarial networks for natural image matting
We present the first generative adversarial network (GAN) for natural image matting. Our novel generator network is trained to predict visually appealing alphas with the addition of an adversarial loss from a discriminator trained to classify well-composited images. Further, we improve existing encoder-decoder architectures to better deal with the spatial localization issues inherent in convolutional neural networks (CNNs) by using dilated convolutions to capture global context information without downscaling feature maps and losing spatial information. We present state-of-the-art results on the alphamatting online benchmark for the gradient error and comparable results in the other metrics. Our method is particularly well suited for fine structures like hair, which is of great importance in practical matting applications, e.g. in film/TV production.
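The "well-composited images" the discriminator classifies are built with the standard matting equation, which blends a foreground and a background per pixel by the predicted alpha. A minimal sketch of that compositing step (the toy images are illustrative):

```python
import numpy as np

def composite(alpha, fg, bg):
    """Standard matting equation C = alpha*F + (1 - alpha)*B, applied
    per pixel; alpha has shape (h, w), fg/bg have shape (h, w, 3)."""
    a = alpha[..., None]          # broadcast alpha over the colour channels
    return a * fg + (1.0 - a) * bg

fg = np.ones((4, 4, 3))           # white foreground
bg = np.zeros((4, 4, 3))          # black background
alpha = np.full((4, 4), 0.25)     # quarter-opaque matte
comp = composite(alpha, fg, bg)
print(comp[0, 0])  # [0.25 0.25 0.25]
```

In the adversarial setup, the generator's predicted alpha is composited this way and the discriminator learns to tell such composites apart from ones made with the ground-truth alpha, penalising visible compositing artefacts.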
Human Recognition in RGBD Combining Object Detectors and Conditional Random Fields
This paper addresses the problem of detecting and segmenting human instances in a point cloud. Both fields have been well studied during the last decades, showing impressive results not only in accuracy but also in computational performance. With the rapid adoption of depth sensors, the need to improve existing state-of-the-art algorithms by integrating depth information as an additional constraint has become more evident. Current challenges involve combining RGB and depth information for reasoning about the location and spatial extent of the object of interest. We make use of an improved deformable part model algorithm, which allows the individual parts to deform across multiple scales and approximates the location of the person in the scene, and a conditional random field energy function for specifying the object's spatial extent. Our proposed energy function models up to pairwise relations defined in the RGBD domain, enforcing label consistency for regions sharing similar unary and pairwise measurements. Experimental results show that our proposed energy function provides a fairly precise segmentation even when the resulting detection box is imprecise. Reasoning about the detection algorithm could potentially enhance the quality of the detection box, allowing it to capture the object of interest as a whole.
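A CRF energy "modelling up to pairwise relations" has the general form E(x) = sum of unary costs plus a penalty for neighbouring pixels with different labels. As a sketch of the simplest such pairwise term, a Potts model on a 4-connected grid (the paper's actual potentials are RGBD-dependent; the uniform penalty here is illustrative):

```python
import numpy as np

def potts_energy(labels, unary, beta=1.0):
    """CRF energy: per-pixel unary costs plus a Potts penalty beta for
    every 4-connected pixel pair with different labels.
    labels: (h, w) ints; unary: (h, w, k) per-pixel cost per label."""
    h, w = labels.shape
    e = unary[np.arange(h)[:, None], np.arange(w)[None, :], labels].sum()
    e += beta * (labels[1:, :] != labels[:-1, :]).sum()   # vertical pairs
    e += beta * (labels[:, 1:] != labels[:, :-1]).sum()   # horizontal pairs
    return e

# Two labels on a 2x2 grid: a uniform labelling beats a checkerboard
unary = np.zeros((2, 2, 2))
uniform = np.zeros((2, 2), dtype=int)
checker = np.array([[0, 1], [1, 0]])
print(potts_energy(uniform, unary), potts_energy(checker, unary))  # 0.0 4.0
```

Minimising such an energy (e.g. by graph cuts) is what enforces the label consistency described above: regions with similar measurements get low pairwise cost for sharing a label.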
Towards a 3D Pipeline for Monitoring and Tracking People in an Indoor Scenario using multiple RGBD Sensors
Development of a hardware-reconfiguration container on a RISC-V-architecture processor implemented in programmable logic
Summarization: With hardware designs becoming more and more complicated in order to serve current-age demands, many hardware teams use reconfigurable hardware to test their designs before putting them on the market, as a cost-effective practice. These demands also divide hardware teams into sections to fulfil the requirements of the design, and consequently cloud services are becoming ever more necessary. With these arguments in mind, this thesis provides a RISC-V deployment platform that allows developers, or teams thereof, to upload and test their custom hardware based on a complete RISC-V processor. The platform offers hardware teams the ability to upload a hardware design to the cloud and test it on a complete platform built around a RISC-V processor.
SIGGRAPH 2018
This poster describes a reinterpretation of Samuel Beckett's theatrical text Play for virtual reality (VR). It is an aesthetic reflection on practice that follows up on a technical project description submitted to ISMAR 2017 [O'Dwyer et al. 2017]. Actors are captured in a green-screen environment using free-viewpoint video (FVV) techniques, and the scene is built in a game engine, complete with binaural spatial audio and six degrees of freedom of movement. The project explores how ludic qualities in the original text help elicit the conversational and interactive specificities of the digital medium. The work affirms the potential for interactive narrative in VR, opens new experiences of the text, and highlights the reorganisation of the author-audience dynamic.